Abstract: We show that any joint probability mass function 
(PMF) can be expressed as a product of parity check factors and 
factors of degree one with the help of some auxiliary variables, 
if the alphabet size is appropriate for defining a parity check 
equation. In other words, marginalization of a joint PMF is 
equivalent to a soft decoding task as long as a finite field can 
be constructed over the alphabet of the PMF. In factor graph 
terminology this claim means that a factor graph representing 
such a joint PMF always has an equivalent Tanner graph. We 
provide a systematic method based on the Hilbert space of PMFs 
and orthogonal projections for obtaining this factorization. 

I. Introduction 

Most problems faced in communication systems take the form of
marginalizing a joint PMF. If the joint PMF is a product of local
functions (factors or interactions), then the marginalization can be
carried out by the sum-product algorithm [1], [2]. However, the
factorization structure of a joint PMF is not always apparent.
Therefore, a systematic method that reveals the factorization
structure of joint PMFs is useful.

We propose such a method, based on the Hilbert space of PMFs and
orthogonal projections. The Hilbert space of PMFs was introduced in
our recent work [3]; this paper presents one of its potential
applications.

Our proposed method factorizes joint PMFs into soft parity check
interactions (SPCIs). We define an SPCI as a generalized form of a
parity check constraint. A parity check constraint guarantees that
the weighted sum of the variables it involves always equals zero. In
an SPCI, by contrast, the weighted sum may take any value, each with
a certain probability. It is shown that SPCIs sharing the same set of
parity check coefficients form a subspace. The factorization of a
joint PMF is then obtained by projecting it onto these subspaces.

Since our method employs parity checks, it is applicable only to
PMFs of certain alphabet sizes. The alphabet size of the random
variables must be a prime or a power of a prime, for which a finite
field exists. This may seem a severe restriction. In communication
problems, however, it causes little trouble, since the alphabet sizes
there are usually two or a power of two.

It is known that soft decoding is a special case of the
marginalization of joint PMFs. In this work we show that the reverse
is also true for certain alphabet sizes. In other words, we show that
the marginalization sum can be handled by a soft decoder. This soft
decoder does not belong to an arbitrary code but to the dual of the
Hamming code.

The paper is organized as follows. In the next section the Hilbert
space of PMFs is briefly introduced. The third section explains the
factorization of joint PMFs in detail. In the fourth section, we show
that the soft decoder of the dual Hamming code can be employed as a
universal marginalization machine.

II. The Hilbert Space of PMFs 

The Hilbert space of PMFs is summarized in this section. Readers may
refer to [3] for a more detailed explanation.

Consider an experiment with a set of outcomes (alphabet)
$\mathcal{A}$ which is discrete and has a finite number of elements.
The probabilities assigned to these outcomes define a PMF such that
$p(x) = \Pr\{x\}$ for every $x$ in $\mathcal{A}$. Each different
assignment of probabilities to the outcomes defines a different PMF.
We denote the set of all possible PMFs defined over the alphabet
$\mathcal{A}$ by $\mathcal{V}_{\mathcal{A}}$, which is formally
defined as

\[ \mathcal{V}_{\mathcal{A}} \triangleq \Big\{ p(x) : \mathcal{A} \to [0,1] \;:\; \sum_{x \in \mathcal{A}} p(x) = 1 \Big\}. \tag{1} \]



Addition and scalar multiplication operations are needed to construct
an algebraic structure over $\mathcal{V}_{\mathcal{A}}$. The addition
of PMFs is denoted by $\boxplus$ and defined as

\[ p(x) \boxplus q(x) \triangleq \frac{1}{Z}\, p(x)\, q(x), \tag{2} \]

where $p(x)$, $q(x)$ are PMFs in $\mathcal{V}_{\mathcal{A}}$ and $Z$
is the normalization constant. The scalar multiplication is denoted
by $\boxtimes$ and is given as

\[ \alpha \boxtimes p(x) \triangleq \frac{1}{Z}\, \big(p(x)\big)^{\alpha}, \tag{3} \]

where $\alpha$ is in $\mathbb{R}$ and $Z$ is again the normalization
constant. This constant is necessary to ensure that
$\mathcal{V}_{\mathcal{A}}$ is closed under addition and scalar
multiplication. Hence its value is
$Z = \sum_{x \in \mathcal{A}} p(x) q(x)$ for addition and
$Z = \sum_{x \in \mathcal{A}} (p(x))^{\alpha}$ for scalar
multiplication. Note that PMFs are denoted not only by the letter $p$
but also by other lowercase letters in this paper.

It can be shown that the set $\mathcal{V}_{\mathcal{A}}$ together with
the operations $\boxplus$ and $\boxtimes$ forms a vector space over
$\mathbb{R}$ [3].
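As a minimal sketch of the two operations in (2) and (3), the following Python fragment represents a PMF as a list of strictly positive probabilities. The function names `normalize`, `pmf_add`, and `pmf_scale` are illustrative choices and do not come from [3].

```python
def normalize(weights):
    """Scale positive weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def pmf_add(p, q):
    """Addition of PMFs as in (2): normalized pointwise product."""
    return normalize([pi * qi for pi, qi in zip(p, q)])

def pmf_scale(alpha, p):
    """Scalar multiplication as in (3): normalized pointwise power."""
    return normalize([pi ** alpha for pi in p])

p = [0.6, 0.3, 0.1]
q = [0.2, 0.5, 0.3]
uniform = [1 / 3, 1 / 3, 1 / 3]

p_plus_zero = pmf_add(p, uniform)            # the uniform PMF acts as the zero vector
p_minus_p = pmf_add(p, pmf_scale(-1.0, p))   # adding the inverse returns the zero vector
lhs = pmf_scale(2.0, pmf_add(p, q))          # scalar times a sum
rhs = pmf_add(pmf_scale(2.0, p), pmf_scale(2.0, q))  # sum of scaled PMFs
```

Under these definitions the uniform PMF plays the role of the zero vector, and scaling by -1 gives the additive inverse, which is what one expects from a vector space.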



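A geometric structure can be defined on this vector space by means of
an inner product. The vector space admits the following function as
an inner product [3]:

\[ \langle p(x), q(x) \rangle : \mathcal{V}_{\mathcal{A}} \times \mathcal{V}_{\mathcal{A}} \to \mathbb{R} \;\triangleq\; \sum_{x \in \mathcal{A}} \log\frac{\big(p(x)\big)^{|\mathcal{A}|}}{\prod_{y \in \mathcal{A}} p(y)} \, \log\frac{\big(q(x)\big)^{|\mathcal{A}|}}{\prod_{y \in \mathcal{A}} q(y)}, \tag{4} \]

where $|\mathcal{A}|$ denotes the cardinality of the set
$\mathcal{A}$. This definition can be simplified by introducing the
following mapping:

\[ \mathcal{L}\{p(x)\} : \mathcal{V}_{\mathcal{A}} \to \mathbb{R}^{|\mathcal{A}|} \;\triangleq\; \sum_{i=0}^{|\mathcal{A}|-1} \log\frac{\big(p(x_i)\big)^{|\mathcal{A}|}}{\prod_{j=0}^{|\mathcal{A}|-1} p(x_j)} \, \mathbf{e}_i, \tag{5} \]

where $x_i$ denotes the $i$th element of the set $\mathcal{A}$ and
$\mathbf{e}_i$ denotes the $i$th canonical basis vector of
$\mathbb{R}^{|\mathcal{A}|}$. Then the inner product of PMFs simply
becomes

\[ \langle p(x), q(x) \rangle = \sum_{i=0}^{|\mathcal{A}|-1} (\mathbf{p})_i (\mathbf{q})_i = \langle \mathbf{p}, \mathbf{q} \rangle, \tag{6} \]

where $\mathbf{p}$, $\mathbf{q}$ are vectors in
$\mathbb{R}^{|\mathcal{A}|}$ such that
$\mathbf{p} = \mathcal{L}\{p(x)\}$, $\mathbf{q} = \mathcal{L}\{q(x)\}$,
and $(\mathbf{p})_i$ ($(\mathbf{q})_i$) denotes the $i$th component of
the vector $\mathbf{p}$ ($\mathbf{q}$). This identity shows that
$\mathcal{L}\{\cdot\}$ is an isometric transformation from
$\mathcal{V}_{\mathcal{A}}$ to $\mathbb{R}^{|\mathcal{A}|}$.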

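The mapping $\mathcal{L}\{\cdot\}$ has further important properties.
It is linear and one-to-one [3]. These properties allow us to find
the dimension of the vector space $\mathcal{V}_{\mathcal{A}}$. The
dimension of $\mathcal{V}_{\mathcal{A}}$ is not simple to calculate
directly, whereas the dimension of the range space of
$\mathcal{L}\{\cdot\}$ is. For any
$p(x) \in \mathcal{V}_{\mathcal{A}}$, let
$\mathbf{p} = \mathcal{L}\{p(x)\}$; then

\[ \sum_{i=0}^{|\mathcal{A}|-1} (\mathbf{p})_i = |\mathcal{A}| \sum_{i=0}^{|\mathcal{A}|-1} \log p(x_i) - |\mathcal{A}| \sum_{j=0}^{|\mathcal{A}|-1} \log p(x_j) = 0. \tag{7} \]

Therefore, the range space of $\mathcal{L}\{\cdot\}$ is the set
$\{\mathbf{p} \in \mathbb{R}^{|\mathcal{A}|} : (1, 1, \ldots, 1)\mathbf{p} = 0\}$,
which is clearly a $(|\mathcal{A}| - 1)$-dimensional subspace of
$\mathbb{R}^{|\mathcal{A}|}$. Hence, $\mathcal{V}_{\mathcal{A}}$ is a
$(|\mathcal{A}| - 1)$-dimensional vector space. Moreover,
$\mathcal{V}_{\mathcal{A}}$ is a Hilbert space since it is a finite
dimensional inner product space.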
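The mapping to $\mathbb{R}^{|\mathcal{A}|}$ and the resulting inner product can be sketched in Python as follows. This is only an illustration: it assumes strictly positive probabilities so that the logarithms are finite, and it assumes each component of the mapping has the form $\log\big(p(x_i)^{|\mathcal{A}|} / \prod_j p(x_j)\big)$. The names `L_map`, `inner`, and `pmf_add` are hypothetical.

```python
import math

def L_map(p):
    """Map a strictly positive PMF to a real vector whose components sum to zero."""
    size = len(p)
    logs = [math.log(v) for v in p]
    total = sum(logs)
    # Component i is log(p_i ** size / prod_j p_j) = size * log(p_i) - sum_j log(p_j).
    return [size * v - total for v in logs]

def inner(p, q):
    """Inner product of two PMFs, computed as a Euclidean dot product of their images."""
    return sum(a * b for a, b in zip(L_map(p), L_map(q)))

def pmf_add(p, q):
    """Addition of PMFs: normalized pointwise product."""
    prod = [a * b for a, b in zip(p, q)]
    z = sum(prod)
    return [v / z for v in prod]

p = [0.6, 0.3, 0.1]
q = [0.2, 0.5, 0.3]
component_sum = sum(L_map(p))        # zero up to rounding
image_of_sum = L_map(pmf_add(p, q))  # equals L_map(p) + L_map(q) by linearity
```

The component sum vanishing is exactly the constraint that makes the range space $(|\mathcal{A}| - 1)$-dimensional.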

A. The Hilbert Space of Joint PMFs 

The Hilbert space structure can be applied to joint PMFs of combined
experiments as long as each individual experiment has a finite
alphabet. Consider a combined experiment consisting of $N$ individual
discrete experiments with alphabets
$\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_N$. Then the
alphabet of the combined experiment, which is denoted by
$\mathcal{S}$, is

\[ \mathcal{S} = \mathcal{A}_1 \times \mathcal{A}_2 \times \cdots \times \mathcal{A}_N. \]

Hence, the alphabet size of the combined experiment is
$|\mathcal{S}| = \prod_{i=1}^{N} |\mathcal{A}_i|$. Consequently, the
dimension of this Hilbert space is

\[ \dim \mathcal{V}_{\mathcal{S}} = \prod_{i=1}^{N} |\mathcal{A}_i| - 1. \tag{8} \]

If all of the individual experiments are defined over the same
alphabet $\mathcal{A}$, then
$\dim \mathcal{V}_{\mathcal{S}} = |\mathcal{A}|^{N} - 1$.



III. Factorization of Joint PMFs 

In this section the factorization of joint PMFs is analyzed in a
systematic way. Let the joint PMF of interest be
$p(x_1, x_2, \ldots, x_N)$, which is an element of
$\mathcal{V}_{\mathcal{S}}$ as defined in the previous section.
Suppose that this joint PMF can be expressed as

\[ p(x_1, x_2, \ldots, x_N) = \prod_{i} \phi_i(\mathcal{X}_i), \tag{9} \]

where the $\mathcal{X}_i$'s are subsets of the set
$\mathcal{X} = \{x_1, x_2, \ldots, x_N\}$ and the arguments of the
functions $\phi_i(\mathcal{X}_i)$ are the elements of $\mathcal{X}_i$.
The functions $\phi_i(\mathcal{X}_i)$ are called factor functions or
interactions.

The factor functions are not necessarily PMFs in general. However, a
proper PMF can be defined for each factor function by scaling it as
follows:

\[ q_i(x_1, x_2, \ldots, x_N) \triangleq \frac{\phi_i(\mathcal{X}_i)}{\sum_{\forall x_1, x_2, \ldots, x_N} \phi_i(\mathcal{X}_i)}. \]

Although $q_i$ has all of $x_1, x_2, \ldots, x_N$ as arguments in this
notation, its value is independent of the arguments in
$\mathcal{X} \setminus \mathcal{X}_i$, and it is still a function of
the members of $\mathcal{X}_i$ only. After this scaling, (9) can be
rewritten as

\[ p(x_1, x_2, \ldots, x_N) = \frac{1}{Z} \prod_{i} q_i(x_1, x_2, \ldots, x_N). \tag{10} \]

Note that $p(x_1, x_2, \ldots, x_N)$ and the
$q_i(x_1, x_2, \ldots, x_N)$'s are all members of the Hilbert space
$\mathcal{V}_{\mathcal{S}}$, and the representation of (10) in this
Hilbert space is

\[ p(x_1, x_2, \ldots, x_N) = \mathop{\boxplus}_{i} \, q_i(x_1, x_2, \ldots, x_N). \tag{11} \]


A. Soft Parity Check Interactions 

A random variable is defined as a mapping from the event space to the
real line. This is also true for discrete experiments. However, if
the number of outcomes of the discrete experiment is appropriate,
defining a discrete random variable as a mapping from the event space
to a Galois field may inspire new ideas. This section is built on
this idea. Therefore, in the rest of the paper it is assumed that a
one-to-one matching can be made between the event space and a Galois
field. In other words, we assume that

\[ \mathcal{A} = \mathrm{GF}(|\mathcal{A}|), \tag{12} \]

where $\mathrm{GF}(|\mathcal{A}|)$ denotes the Galois field of order
$|\mathcal{A}|$. Furthermore, it is assumed that combined experiments
consist of individual experiments with identical event spaces. That
is,

\[ \mathcal{S} = \mathcal{A}^{N} = \mathrm{GF}^{N}(|\mathcal{A}|). \]

Working on Galois fields allows us to define interactions (factor
functions, joint PMFs) based on algebraic operations. An example of
such an interaction is the soft parity check interaction (SPCI). We
define the SPCI as follows.

Definition 1. Soft Parity Check Interaction: A joint PMF
$p(x_1, x_2, \ldots, x_N)$ in $\mathcal{V}_{\mathcal{S}}$, where
$\mathcal{S} = \mathrm{GF}^{N}(|\mathcal{A}|)$, is called a soft
parity check interaction if there exist a
$q(x) \in \mathcal{V}_{\mathrm{GF}(|\mathcal{A}|)}$ and a vector
$\mathbf{a} = (a_1, a_2, \ldots, a_N) \in \mathrm{GF}^{N}(|\mathcal{A}|)$
such that

\[ p(\mathbf{x}) = \frac{1}{|\mathcal{A}|^{N-1}} \, q(\mathbf{a}\mathbf{x}^{T}), \]

where $\mathbf{x}$ denotes $(x_1, x_2, \ldots, x_N)$ and $T$ denotes
transposition. Moreover, the vector $\mathbf{a}$ is called the parity
check coefficient vector of the SPCI, and the weight of this vector
is called the order of the SPCI $p(\mathbf{x})$.

As its name implies, an SPCI relates the random variables by a parity
check equation. The term "soft" arises from the fact that the parity
check equation is not guaranteed to be satisfied. That is, the
weighted sum of the random variables has a probability distribution
rather than being guaranteed to be zero.

Example 1. Let $p_1(x_1, x_2)$ and $p_2(x_1, x_2)$ be two PMFs which
are given, with a slight abuse of notation, as

\[ p_1(x_1, x_2) = \begin{bmatrix} 0.2 & 0.1 & 0.1/3 \\ 0.1/3 & 0.2 & 0.1 \\ 0.1 & 0.1/3 & 0.2 \end{bmatrix}, \qquad p_2(x_1, x_2) = \frac{1}{238} \begin{bmatrix} 144 & 18 & 6 \\ 3 & 18 & 36 \\ 3 & 4 & 6 \end{bmatrix}, \]

where the $i$th row and $j$th column of the matrices represent the
value of $p_{1,2}(x_1 = i - 1, x_2 = j - 1)$. In this example
$p_1(x_1, x_2) = (1/3)\, q(x_1 + 2x_2)$ where
$q(x) = [0.6 \;\; 0.1 \;\; 0.3]$ with a similar abuse of notation.
Hence $p_1(x_1, x_2)$ is an SPCI. On the other hand, $p_2(x_1, x_2)$
is not an SPCI, since no such expression exists for it.
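The two claims of Example 1 can be checked numerically. The sketch below assumes an alphabet of size three, so that arithmetic in GF(3) reduces to integer arithmetic modulo 3 (this shortcut is valid only for prime alphabet sizes). The helper names `spci` and `is_spci` are illustrative.

```python
SIZE = 3  # prime alphabet size, so GF(3) arithmetic is arithmetic modulo 3

def spci(q, a):
    """Table p(x1, x2) = q(a1*x1 + a2*x2) / SIZE for two variables (N = 2)."""
    return [[q[(a[0] * x1 + a[1] * x2) % SIZE] / SIZE for x2 in range(SIZE)]
            for x1 in range(SIZE)]

def is_spci(table):
    """True if the table is constant on the level sets of a1*x1 + a2*x2 for some nonzero a."""
    for a1 in range(SIZE):
        for a2 in range(SIZE):
            if (a1, a2) == (0, 0):
                continue
            seen = {}
            constant = True
            for x1 in range(SIZE):
                for x2 in range(SIZE):
                    c = (a1 * x1 + a2 * x2) % SIZE
                    v = table[x1][x2]
                    # Remember the first value seen on each level set and compare.
                    if abs(seen.setdefault(c, v) - v) > 1e-12:
                        constant = False
            if constant:
                return True
    return False

p1 = spci([0.6, 0.1, 0.3], (1, 2))
p2 = [[v / 238 for v in row] for row in ([144, 18, 6], [3, 18, 36], [3, 4, 6])]
p1_is_spci = is_spci(p1)
p2_is_spci = is_spci(p2)
```

A two-variable SPCI over GF(3) takes at most three distinct values, each on a level set of three points, which is why the table of $p_2$ fails the test.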

SPCIs have some important properties. Firstly, we investigate the
marginal functions associated with an SPCI. If the order of the SPCI
is one, then the $i$th marginal function becomes

\[ \sum_{\mathbf{x} \setminus x_i} \frac{1}{|\mathcal{A}|^{N-1}} \, q(\mathbf{a}\mathbf{x}^{T}) = \begin{cases} q(a_i x_i), & a_i \neq 0 \\ \dfrac{1}{|\mathcal{A}|}, & \text{otherwise.} \end{cases} \]

In other words, SPCIs of order one provide local evidence about the
variable whose associated coefficient is nonzero. If the order is
greater than one, then

\[ \sum_{\mathbf{x} \setminus x_i} \frac{1}{|\mathcal{A}|^{N-1}} \, q(\mathbf{a}\mathbf{x}^{T}) = \frac{1}{|\mathcal{A}|} \tag{13} \]

for all $i \in \{1, 2, \ldots, N\}$, which means that SPCIs of order
greater than one do not provide any local evidence. However, these
SPCIs provide information when used together with other SPCIs. Hence,
we say that SPCIs of order greater than one provide purely extrinsic
information.
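Both marginalization results can be confirmed by direct summation. The following sketch assumes a prime alphabet size (here 3, so that field arithmetic is arithmetic modulo 3) and uses an illustrative helper `spci_marginal`; variable indices are zero-based in the code.

```python
from itertools import product

def spci_marginal(q, a, i, size):
    """Marginal of variable i for p(x) = q(a . x) / size**(N-1), by brute-force summation."""
    n = len(a)
    marginal = [0.0] * size
    for x in product(range(size), repeat=n):
        c = sum(ak * xk for ak, xk in zip(a, x)) % size
        marginal[x[i]] += q[c] / size ** (n - 1)
    return marginal

q = [0.6, 0.1, 0.3]
# Order one: only the second coefficient is nonzero, so the marginal is q(2 * x).
order_one = spci_marginal(q, (0, 2, 0), 1, 3)
# Order two: the marginal of any single variable is uniform.
order_two = spci_marginal(q, (1, 2, 0), 0, 3)
```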

Secondly, from the point of view of the sum-product algorithm,
message computation for SPCIs is less complex. In general, for a
factor function in $\mathcal{V}_{\mathcal{S}}$, the message
computation complexity is $|\mathcal{A}|^{N}$. The reduced complexity
message computation algorithm for low-density parity-check decoding
presented in [4] is directly applicable to SPCIs as well. Hence, the
message computation complexity for an SPCI is
$N |\mathcal{A}| \log |\mathcal{A}|$.

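Finally, and most importantly, the set of SPCIs sharing the same
parity check coefficients is, as stated by Theorem 1, a subspace of
$\mathcal{V}_{\mathcal{S}}$. The set of SPCIs with the parity check
coefficient vector $\mathbf{a}$ is denoted by
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}}$ and defined as follows:

\[ \mathcal{V}_{\mathcal{S}}^{\mathbf{a}} \triangleq \Big\{ p(\mathbf{x}) = \frac{1}{|\mathcal{A}|^{N-1}} \, q(\mathbf{a}\mathbf{x}^{T}) \;:\; q(x) \in \mathcal{V}_{\mathrm{GF}(|\mathcal{A}|)} \Big\}. \]

Theorem 1. For any nonzero $\mathbf{a}$ in
$\mathrm{GF}^{N}(|\mathcal{A}|)$,
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}}$ is a
$(|\mathcal{A}| - 1)$-dimensional subspace of
$\mathcal{V}_{\mathcal{S}}$.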

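Proof: For each $\mathbf{a}$, we can define the following mapping:

\[ \mathcal{T}_{\mathbf{a}}\{q(x)\} : \mathcal{V}_{\mathrm{GF}(|\mathcal{A}|)} \to \mathcal{V}_{\mathcal{S}} \;\triangleq\; \frac{1}{|\mathcal{A}|^{N-1}} \, q(\mathbf{a}\mathbf{x}^{T}). \]

Clearly this mapping is one-to-one, and it can easily be shown that
it is also linear. It is well known from linear algebra that the
range space of a linear mapping is a subspace of the co-domain.
Moreover, if the mapping is one-to-one, the dimension of the range
space is equal to the dimension of the domain of the mapping. Hence,

\[ \dim \mathcal{V}_{\mathcal{S}}^{\mathbf{a}} = \dim \mathcal{V}_{\mathrm{GF}(|\mathcal{A}|)} = |\mathcal{A}| - 1. \tag{14} \]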
Now the relations between two different subspaces defined 
by two different parity check coefficient vectors can be investi- 
gated. These relations are explained by the following theorems. 

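Theorem 2. For any two nonzero parity check coefficient vectors
$\mathbf{a}$ and $\mathbf{b}$ in $\mathrm{GF}^{N}(|\mathcal{A}|)$,
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}} = \mathcal{V}_{\mathcal{S}}^{\mathbf{b}}$
if $\mathbf{a} = \alpha \mathbf{b}$ for an $\alpha$ in
$\mathrm{GF}(|\mathcal{A}|)$.

Proof: For any $p(\mathbf{x})$ in
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}}$ there exists a $q_1(x)$ in
$\mathcal{V}_{\mathrm{GF}(|\mathcal{A}|)}$ such that
$p(\mathbf{x}) = \frac{1}{|\mathcal{A}|^{N-1}} q_1(\mathbf{a}\mathbf{x}^{T})$.
Let $q_2(x) = q_1(\alpha x)$. Clearly $q_2(x)$ is in
$\mathcal{V}_{\mathrm{GF}(|\mathcal{A}|)}$. Then,

\[ p(\mathbf{x}) = \frac{1}{|\mathcal{A}|^{N-1}} \, q_1(\alpha \mathbf{b}\mathbf{x}^{T}) = \frac{1}{|\mathcal{A}|^{N-1}} \, q_2(\mathbf{b}\mathbf{x}^{T}). \]

Therefore, $p(\mathbf{x})$ is also an element of
$\mathcal{V}_{\mathcal{S}}^{\mathbf{b}}$, so
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}} \subseteq \mathcal{V}_{\mathcal{S}}^{\mathbf{b}}$.
Since both subspaces have dimension $|\mathcal{A}| - 1$ by Theorem 1,
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}} = \mathcal{V}_{\mathcal{S}}^{\mathbf{b}}$
if $\mathbf{a} = \alpha \mathbf{b}$.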



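Theorem 3. For any two nonzero parity check coefficient vectors
$\mathbf{a}$ and $\mathbf{b}$ in $\mathrm{GF}^{N}(|\mathcal{A}|)$, the
subspace $\mathcal{V}_{\mathcal{S}}^{\mathbf{a}}$ is orthogonal to the
subspace $\mathcal{V}_{\mathcal{S}}^{\mathbf{b}}$ if
$\mathbf{a} \neq \alpha \mathbf{b}$ for any $\alpha$ in
$\mathrm{GF}(|\mathcal{A}|)$.

Proof: For any $p_1(\mathbf{x}) \in \mathcal{V}_{\mathcal{S}}^{\mathbf{a}}$
and $p_2(\mathbf{x}) \in \mathcal{V}_{\mathcal{S}}^{\mathbf{b}}$, the
inner product of these two SPCIs is

\[ \langle p_1(\mathbf{x}), p_2(\mathbf{x}) \rangle = \sum_{\forall \mathbf{x}} \log\frac{\big(p_1(\mathbf{x})\big)^{|\mathcal{A}|^{N}}}{\prod_{\forall \mathbf{y}} p_1(\mathbf{y})} \, \log\frac{\big(p_2(\mathbf{x})\big)^{|\mathcal{A}|^{N}}}{\prod_{\forall \mathbf{y}} p_2(\mathbf{y})}. \]

Let $q_1(\mathbf{a}\mathbf{x}^{T}) = |\mathcal{A}|^{N-1} p_1(\mathbf{x})$
and $q_2(\mathbf{b}\mathbf{x}^{T}) = |\mathcal{A}|^{N-1} p_2(\mathbf{x})$.
Then the inner product can be rewritten as

\[ \langle p_1(\mathbf{x}), p_2(\mathbf{x}) \rangle = \sum_{\forall \mathbf{x}} \log\frac{\big(q_1(\mathbf{a}\mathbf{x}^{T})\big)^{|\mathcal{A}|^{N}}}{\prod_{\forall \mathbf{y}} q_1(\mathbf{a}\mathbf{y}^{T})} \, \log\frac{\big(q_2(\mathbf{b}\mathbf{x}^{T})\big)^{|\mathcal{A}|^{N}}}{\prod_{\forall \mathbf{y}} q_2(\mathbf{b}\mathbf{y}^{T})}. \]

In order to simplify the notation we can use the operator
$\mathcal{L}\{\cdot\}$. Let $\mathbf{q}_1 = \mathcal{L}\{q_1(x)\}$ and
$\mathbf{q}_2 = \mathcal{L}\{q_2(x)\}$. Then the inner product can be
simplified as

\[ \langle p_1(\mathbf{x}), p_2(\mathbf{x}) \rangle = |\mathcal{A}|^{2N-2} \sum_{\forall \mathbf{x}} (\mathbf{q}_1)_{\mathbf{a}\mathbf{x}^{T}} \, (\mathbf{q}_2)_{\mathbf{b}\mathbf{x}^{T}}, \]

where the constant $|\mathcal{A}|^{2N-2}$ arises from the difference
between the alphabet sizes of $\mathcal{S}$ and
$\mathrm{GF}(|\mathcal{A}|)$. Then, for some dummy variables
$c_1$, $c_2$ in $\mathrm{GF}(|\mathcal{A}|)$, the summation above can
be grouped as follows:

\begin{align*}
\frac{\langle p_1(\mathbf{x}), p_2(\mathbf{x}) \rangle}{|\mathcal{A}|^{2N-2}}
 &= \sum_{\forall c_1} \sum_{\forall c_2} \sum_{\forall \mathbf{x} \in \mathcal{K}} (\mathbf{q}_1)_{c_1} (\mathbf{q}_2)_{c_2} \\
 &= \sum_{\forall c_1} \Big( (\mathbf{q}_1)_{c_1} \sum_{\forall c_2} \Big( (\mathbf{q}_2)_{c_2} \sum_{\forall \mathbf{x} \in \mathcal{K}} 1 \Big) \Big) \\
 &= \sum_{\forall c_1} \Big( (\mathbf{q}_1)_{c_1} \sum_{\forall c_2} (\mathbf{q}_2)_{c_2} \, |\mathcal{K}| \Big),
\end{align*}

where $\mathcal{K} = \{\mathbf{x} \in \mathrm{GF}^{N}(|\mathcal{A}|) : \mathbf{a}\mathbf{x}^{T} = c_1 \wedge \mathbf{b}\mathbf{x}^{T} = c_2\}$.
If $\mathbf{a}$ were equal to $\alpha \mathbf{b}$, then there would be
either $|\mathcal{A}|^{N-1}$ or no vectors $\mathbf{x}$ satisfying the
conditions of the set $\mathcal{K}$, depending on the values of $c_1$
and $c_2$. However, since $\mathbf{a}$ is not a scaled version of
$\mathbf{b}$, there are always $|\mathcal{A}|^{N-2}$ elements in
$\mathcal{K}$ regardless of the values of $c_1$ and $c_2$. Hence, the
inner product becomes

\begin{align*}
\langle p_1(\mathbf{x}), p_2(\mathbf{x}) \rangle
 &= |\mathcal{A}|^{3N-4} \Big( \sum_{\forall c_1} (\mathbf{q}_1)_{c_1} \Big) \Big( \sum_{\forall c_2} (\mathbf{q}_2)_{c_2} \Big) \\
 &= 0,
\end{align*}

where the last line follows from (7). Finally, the subspace
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}}$ is orthogonal to
$\mathcal{V}_{\mathcal{S}}^{\mathbf{b}}$ since any $p_1(\mathbf{x})$
in $\mathcal{V}_{\mathcal{S}}^{\mathbf{a}}$ is orthogonal to any
$p_2(\mathbf{x})$ in $\mathcal{V}_{\mathcal{S}}^{\mathbf{b}}$. ■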
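The orthogonality claimed by Theorem 3 can be spot-checked numerically. The sketch below assumes GF(3) with $N = 2$ (arithmetic modulo 3) and evaluates the inner product directly from centered log-probabilities scaled by the alphabet size of the joint space; the helper names `spci_flat` and `inner` are illustrative.

```python
import math
from itertools import product

SIZE, N = 3, 2

def spci_flat(q, a):
    """SPCI values q(a . x) / SIZE**(N-1), listed over all x in GF(SIZE)^N."""
    return [q[sum(ak * xk for ak, xk in zip(a, x)) % SIZE] / SIZE ** (N - 1)
            for x in product(range(SIZE), repeat=N)]

def inner(p, q):
    """Inner product of two joint PMFs given as flat lists of positive values."""
    n = len(p)
    lp = [math.log(v) for v in p]
    lq = [math.log(v) for v in q]
    sp, sq = sum(lp), sum(lq)
    # Each factor is log(p(x)**n / prod_y p(y)) = n * log p(x) - sum_y log p(y).
    return sum((n * a - sp) * (n * b - sq) for a, b in zip(lp, lq))

q1 = [0.6, 0.1, 0.3]
q2 = [0.2, 0.5, 0.3]
# (1, 1) and (1, 2) are not scalar multiples of each other: orthogonal subspaces.
independent_case = inner(spci_flat(q1, (1, 1)), spci_flat(q2, (1, 2)))
# (2, 2) = 2 * (1, 1): same subspace, so the inner product is generally nonzero.
dependent_case = inner(spci_flat(q1, (1, 1)), spci_flat(q2, (2, 2)))
```

This is a check on one pair of PMFs, not a proof; it merely illustrates the two cases distinguished in the argument above.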
The next question to be asked after Theorem 3 is how many distinct
subspaces there are. This question is equivalent to asking for the
number of distinct vectors in $\mathrm{GF}^{N}(|\mathcal{A}|)$ such
that every pair of vectors is linearly independent. Note that the
answer to this question is equal to the number of columns of a parity
check matrix of a Hamming code defined over
$\mathrm{GF}(|\mathcal{A}|)$ having $N$ rows. As explained in [5],
the number of distinct vectors in $\mathrm{GF}^{N}(|\mathcal{A}|)$
which are pairwise linearly independent is
$\frac{|\mathcal{A}|^{N} - 1}{|\mathcal{A}| - 1}$, and so is the
number of distinct subspaces. Then we can state the following
theorem.
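The count can be reproduced by enumeration. A minimal sketch, assuming field elements are labelled $0, 1, \ldots, |\mathcal{A}| - 1$ with 1 the multiplicative identity: every nonzero vector has exactly one scalar multiple whose first nonzero entry is 1, so these normalized vectors form a maximal pairwise linearly independent set. The function name is illustrative.

```python
from itertools import product

def pairwise_independent_vectors(size, n):
    """Nonzero vectors of length n over a field of the given size whose first nonzero entry is 1."""
    reps = []
    for v in product(range(size), repeat=n):
        nonzero = [c for c in v if c != 0]
        if nonzero and nonzero[0] == 1:
            reps.append(v)
    return reps

count_3_2 = len(pairwise_independent_vectors(3, 2))  # (3**2 - 1) // (3 - 1)
count_2_3 = len(pairwise_independent_vectors(2, 3))  # (2**3 - 1) // (2 - 1)
count_5_3 = len(pairwise_independent_vectors(5, 3))  # (5**3 - 1) // (5 - 1)
```

For $|\mathcal{A}| = 3$ and $N = 2$ the enumeration yields the vectors (0, 1), (1, 0), (1, 1), and (1, 2).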

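Theorem 4. Let $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_M$ be
pairwise linearly independent vectors in
$\mathrm{GF}^{N}(|\mathcal{A}|)$, where
$M = \frac{|\mathcal{A}|^{N} - 1}{|\mathcal{A}| - 1}$. Then the
orthogonal direct sum of the subspaces
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}_1}, \mathcal{V}_{\mathcal{S}}^{\mathbf{a}_2}, \ldots, \mathcal{V}_{\mathcal{S}}^{\mathbf{a}_M}$
is equal to $\mathcal{V}_{\mathcal{S}}$. In other words,

\[ \mathcal{V}_{\mathcal{S}} = \bigoplus_{i=1}^{M} \mathcal{V}_{\mathcal{S}}^{\mathbf{a}_i}. \tag{15} \]

Proof: The orthogonal direct sum of subspaces is a subspace. Hence,
the right hand side of the equation above is a subspace of
$\mathcal{V}_{\mathcal{S}}$ and its dimension is given as

\[ \dim \bigoplus_{i=1}^{M} \mathcal{V}_{\mathcal{S}}^{\mathbf{a}_i} = \sum_{i=1}^{M} \dim \mathcal{V}_{\mathcal{S}}^{\mathbf{a}_i} = |\mathcal{A}|^{N} - 1 \tag{16} \]

due to Theorem 1. As explained in Section II-A, the dimension of
$\mathcal{V}_{\mathcal{S}}$ is also $|\mathcal{A}|^{N} - 1$.
Consequently,
$\mathcal{V}_{\mathcal{S}} = \bigoplus_{i=1}^{M} \mathcal{V}_{\mathcal{S}}^{\mathbf{a}_i}$. ■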



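This theorem has important consequences. Any joint PMF
$p(\mathbf{x})$ can be projected onto the subspaces
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}_i}$ by using the inner product.
Theorem 4 states that the vector summation of these projections is
equal to the original joint PMF. In other words,

\begin{align*}
p(\mathbf{x}) &= p_{\mathbf{a}_1}(\mathbf{x}) \boxplus p_{\mathbf{a}_2}(\mathbf{x}) \boxplus \cdots \boxplus p_{\mathbf{a}_M}(\mathbf{x}) \\
 &= \frac{1}{Z} \prod_{i=1}^{M} p_{\mathbf{a}_i}(\mathbf{x}), \tag{17}
\end{align*}

where the last line follows from the definition of the $\boxplus$
operation and $p_{\mathbf{a}_i}(\mathbf{x})$ denotes the projection of
$p(\mathbf{x})$ onto the subspace
$\mathcal{V}_{\mathcal{S}}^{\mathbf{a}_i}$. These projections can be
calculated by

\[ p_{\mathbf{a}_i}(\mathbf{x}) = \mathop{\boxplus}_{j=1}^{|\mathcal{A}|-1} \langle p(\mathbf{x}), \psi_{i,j}(\mathbf{x}) \rangle \boxtimes \psi_{i,j}(\mathbf{x}), \tag{18} \]

where $\psi_{i,j}(\mathbf{x})$ denotes the $j$th orthonormal basis PMF
of the $i$th subspace. Moreover, since the
$p_{\mathbf{a}_i}(\mathbf{x})$'s are SPCIs, we can write
$p(\mathbf{x})$ as

\[ p(\mathbf{x}) = \frac{1}{Z} \prod_{i=1}^{M} q_i(\mathbf{a}_i \mathbf{x}^{T}), \tag{19} \]

where all scaling coefficients are merged in $Z$ and
$q_i(\mathbf{a}_i \mathbf{x}^{T}) = |\mathcal{A}|^{N-1} p_{\mathbf{a}_i}(\mathbf{x})$.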

Example 2. Consider the $p_2(x_1, x_2)$ given in Example 1. It can be
factorized as

\[ p_2(x_1, x_2) = \frac{900}{119} \, q_1(x_1) \, q_2(x_2) \, q_3(x_1 + x_2) \, q_4(x_1 + 2x_2), \]

where $q_1(x) = \frac{1}{10}[6 \;\; 3 \;\; 1]$,
$q_2(x) = \frac{1}{3}[1 \;\; 1 \;\; 1]$,
$q_3(x) = \frac{1}{6}[4 \;\; 1 \;\; 1]$, and
$q_4(x) = \frac{1}{10}[6 \;\; 1 \;\; 3]$. In fact, $q_2(x_2)$ can be
omitted since it is a constant.
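The factors of this example can be recovered numerically from the table of $p_2$. The Python sketch below does not evaluate the orthonormal basis expansion in (18); as a simplifying assumption it uses the fact that a member of an SPCI subspace depends on $\mathbf{x}$ only through $\mathbf{a}\mathbf{x}^{T}$, so the orthogonal projection can be computed by averaging the centered log-PMF over the level sets of $\mathbf{a}\mathbf{x}^{T}$. The function name `project` and the dictionary layout are illustrative, and GF(3) arithmetic is taken modulo 3.

```python
import math
from itertools import product

SIZE = 3
table = [144, 18, 6, 3, 18, 36, 3, 4, 6]  # rows indexed by x1, columns by x2
p2 = {x: v / 238 for x, v in zip(product(range(SIZE), repeat=2), table)}

def project(p, a):
    """Return the PMF q such that q(a . x) is proportional to the projection of p."""
    logs = {x: math.log(v) for x, v in p.items()}
    mean = sum(logs.values()) / len(logs)
    level_sum = [0.0] * SIZE
    for x, v in logs.items():
        c = sum(ak * xk for ak, xk in zip(a, x)) % SIZE
        level_sum[c] += v - mean
    # Each level set holds SIZE**(N-1) points; with N = 2 that is SIZE points.
    weights = [math.exp(s / SIZE) for s in level_sum]
    total = sum(weights)
    return [w / total for w in weights]

coefficient_vectors = [(1, 0), (0, 1), (1, 1), (1, 2)]
factors = {a: project(p2, a) for a in coefficient_vectors}
```

With these four coefficient vectors, the recovered PMFs are the normalized weight vectors [6 3 1], [1 1 1], [4 1 1], and [6 1 3], respectively.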

B. Parity Check Interactions 

Any SPCI can be transformed into a usual parity check factor
function, which is nothing but an indicator function, by employing an
auxiliary variable in $\mathrm{GF}(|\mathcal{A}|)$ as follows:

\[ \frac{1}{|\mathcal{A}|^{N-1}} \, q(\mathbf{a}\mathbf{x}^{T}) = \frac{1}{|\mathcal{A}|^{N-1}} \sum_{\forall u \in \mathrm{GF}(|\mathcal{A}|)} I(\mathbf{a}\mathbf{x}^{T} - u) \, q(u), \tag{20} \]

where $I(x)$ is the indicator function, whose value is one if $x = 0$
and zero otherwise. This transformation allows expressing any joint
PMF as a product of parity check factors and factors of degree one.

Fig. 1. Tanner graph of $p_2(x_1, x_2)$ given in Examples 1, 2, and 3.

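$N$ of the parity check coefficient vectors of the SPCIs in (19) can
be selected as the $N$ canonical basis vectors of
$\mathrm{GF}^{N}(|\mathcal{A}|)$. Then the product in (19) can be
grouped as

\[ p(\mathbf{x}) = \frac{1}{Z} \prod_{i=1}^{N} q_i(x_i) \prod_{i=N+1}^{M} q_i(\mathbf{a}_i \mathbf{x}^{T}). \tag{21} \]

The second product above can be transformed into parity check
constraints using (20) as follows:

\[ p(\mathbf{x}) = \frac{1}{Z} \prod_{i=1}^{N} q_i(x_i) \sum_{\forall \mathbf{u}} \prod_{i=N+1}^{M} I(\mathbf{a}_i \mathbf{x}^{T} - u_{i-N}) \, q_i(u_{i-N}), \]

where $\mathbf{u}$ denotes $(u_1, u_2, \ldots, u_{M-N})$. Let
$r(\mathbf{x}, \mathbf{u})$ be a PMF defined over
$\mathrm{GF}^{M}(|\mathcal{A}|)$ as follows:

\[ r(\mathbf{x}, \mathbf{u}) \triangleq \frac{1}{Z} \Big( \prod_{i=1}^{N} q_i(x_i) \Big) \Big( \prod_{i=N+1}^{M} q_i(u_{i-N}) \Big) \cdot \Big( \prod_{i=N+1}^{M} I(\mathbf{a}_i \mathbf{x}^{T} - u_{i-N}) \Big). \tag{22} \]

Clearly, $p(\mathbf{x}) = \sum_{\forall \mathbf{u}} r(\mathbf{x}, \mathbf{u})$.
Hence, $r(\mathbf{x}, \mathbf{u})$ carries all the information that
$p(\mathbf{x})$ has about the $x_i$'s. As (22) displays,
$r(\mathbf{x}, \mathbf{u})$ can be expressed as a product of parity
check factors and factors of degree one, which was our goal. Note
that this factorization can be represented by a Tanner graph.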

Example 3. The Tanner graph of $p_2(x_1, x_2)$ in the previous
examples is shown in Figure 1, which represents the following
factorization:

\[ r(x_1, x_2, u_1, u_2) = I(x_1 + x_2 - u_1) \, I(x_1 + 2x_2 - u_2) \cdot q_1(x_1) \, q_2(x_2) \, q_3(u_1) \, q_4(u_2). \]
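That summing this factorization over the auxiliary variables recovers $p_2$ can be verified by brute force. The sketch below is an illustration under two assumptions: GF(3) arithmetic is taken modulo 3, and the factors are represented by the unnormalized integer weights [6 3 1], [1 1 1], [4 1 1], and [6 1 3], so the result matches the table of $p_2$ up to the common factor 1/238.

```python
from itertools import product

# Unnormalized integer weights standing in for q1, q2, q3, q4 (an assumption of this sketch).
q1, q2, q3, q4 = [6, 3, 1], [1, 1, 1], [4, 1, 1], [6, 1, 3]

def r(x1, x2, u1, u2):
    """Two parity check indicators times four degree-one factors, unnormalized."""
    check1 = 1 if (x1 + x2 - u1) % 3 == 0 else 0
    check2 = 1 if (x1 + 2 * x2 - u2) % 3 == 0 else 0
    return check1 * check2 * q1[x1] * q2[x2] * q3[u1] * q4[u2]

# Sum out the auxiliary variables u1 and u2 for every (x1, x2).
summed = {(x1, x2): sum(r(x1, x2, u1, u2) for u1, u2 in product(range(3), repeat=2))
          for x1, x2 in product(range(3), repeat=2)}
total = sum(summed.values())
```

For each $(x_1, x_2)$ exactly one pair $(u_1, u_2)$ satisfies both checks, so the sum over $\mathbf{u}$ picks out a single term.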

IV. Universal Marginalization Machine 

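The third product in (22) represents the parity check constraints
imposed by a linear code. This product evaluates as

\[ \prod_{i=N+1}^{M} I(\mathbf{a}_i \mathbf{x}^{T} - u_{i-N}) = \begin{cases} 1, & \mathbf{H} [\mathbf{x} \;\; \mathbf{u}]^{T} = \mathbf{0} \\ 0, & \text{otherwise,} \end{cases} \]

where the matrix $\mathbf{H}$ is

\[ \mathbf{H} = \begin{bmatrix} \mathbf{a}_{N+1} & -1 & 0 & \cdots & 0 \\ \mathbf{a}_{N+2} & 0 & -1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathbf{a}_{M} & 0 & 0 & \cdots & -1 \end{bmatrix} = [\mathbf{P} \;\; -\mathbf{I}]. \tag{23} \]

The generator matrix $\mathbf{G}$ of this code is
$[\mathbf{I} \;\; \mathbf{P}^{T}]$. Recall that the vectors
$\mathbf{a}_{N+1}, \mathbf{a}_{N+2}, \ldots, \mathbf{a}_M$ are all
pairwise linearly independent. Moreover, each of these vectors is
also linearly independent of every column of the identity matrix,
since the weights of these vectors are two or more. Hence, all
columns of $\mathbf{G}$ are pairwise linearly independent, which
means that $\mathbf{G}$ is the parity check matrix of a Hamming code.
Therefore, $\mathbf{H}$ is the parity check matrix of the
$\big(\frac{|\mathcal{A}|^{N} - 1}{|\mathcal{A}| - 1}, N\big)$ dual
Hamming code.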
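For a concrete instance, the matrices can be written out for $|\mathcal{A}| = 3$ and $N = 2$, where $M = 4$ and the two coefficient vectors of weight two may be taken as (1, 1) and (1, 2). The sketch below assumes arithmetic modulo 3 for GF(3); the helper `independent` uses a 2x2 determinant and is therefore specific to $N = 2$.

```python
SIZE = 3
P = [(1, 1), (1, 2)]          # rows are the coefficient vectors of weight two
N, K = 2, len(P)

# H = [P  -I], with -1 represented as 2 modulo 3.
H = [list(row) + [(-1 if j == i else 0) % SIZE for j in range(K)]
     for i, row in enumerate(P)]
# G = [I  P^T].
G = [[1 if j == i else 0 for j in range(N)] + [P[k][i] for k in range(K)]
     for i in range(N)]

# Every row of G must be orthogonal to every row of H over GF(3).
syndrome = [[sum(g * h for g, h in zip(g_row, h_row)) % SIZE for h_row in H]
            for g_row in G]

def independent(u, v):
    """Two length-2 vectors over GF(3) are linearly independent iff their determinant is nonzero."""
    return (u[0] * v[1] - u[1] * v[0]) % SIZE != 0

columns = list(zip(*G))
all_pairs_independent = all(independent(columns[i], columns[j])
                            for i in range(len(columns))
                            for j in range(i + 1, len(columns)))
```

Here the columns of $\mathbf{G}$ are (1, 0), (0, 1), (1, 1), and (1, 2), which is a parity check matrix of the length-4 ternary Hamming code.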

If a soft decoder existed for this code that gives the exact marginal
a posteriori PMFs for each code symbol, then this soft decoder could
be used to compute the marginal PMFs of $N$ random variables having
any joint PMF. Hence, we call such a soft decoder the universal
marginalization machine (UMM). The UMM can be configured to
marginalize a joint PMF by applying the appropriate $q_i(x_i)$'s and
$q_i(u_{i-N})$'s as inputs to the UMM.
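The idea can be illustrated with a brute-force stand-in for the soft decoder. The sketch below is not a practical decoder: it simply enumerates all codewords of the code generated by $[\mathbf{I} \;\; \mathbf{P}^{T}]$ for $|\mathcal{A}| = 3$, $N = 2$, weights each codeword by the product of per-symbol input weights, and accumulates symbol-wise marginals. The input weights are the unnormalized integer vectors [6 3 1], [1 1 1], [4 1 1], and [6 1 3] associated with $p_2$; the function name `symbol_marginals` is hypothetical, and GF(3) arithmetic is taken modulo 3.

```python
from itertools import product

G = [[1, 0, 1, 1], [0, 1, 1, 2]]                       # generator matrix over GF(3)
inputs = [[6, 3, 1], [1, 1, 1], [4, 1, 1], [6, 1, 3]]  # weights for x1, x2, u1, u2

def symbol_marginals(G, inputs, size):
    """Exact unnormalized symbol-wise marginals by enumerating every codeword."""
    n_info, n_code = len(G), len(G[0])
    marginals = [[0] * size for _ in range(n_code)]
    for msg in product(range(size), repeat=n_info):
        # Encode: codeword = msg * G over GF(size).
        word = [sum(m * g for m, g in zip(msg, col)) % size for col in zip(*G)]
        weight = 1
        for symbol, w in zip(word, inputs):
            weight *= w[symbol]
        for pos, symbol in enumerate(word):
            marginals[pos][symbol] += weight
    return marginals

marginals = symbol_marginals(G, inputs, 3)
x1_marginal = marginals[0]   # proportional to the row sums of the p2 table
x2_marginal = marginals[1]   # proportional to the column sums of the p2 table
```

Dividing by 238 gives the marginal PMFs of $x_1$ and $x_2$ under $p_2$. The enumeration cost grows as $|\mathcal{A}|^{N}$, so this only demonstrates the equivalence and says nothing about efficient decoding.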

This approach shows that the marginalization sum, which is the
central part of many communication problems, can be handled by a soft
decoder. This is an important result from a practical point of view,
since soft decoders can be approximated by analog VLSI structures
[6]. For instance, an analog equalizer can be implemented in this
way.

V. Conclusion and Future Directions 

In this paper we have presented a method for factorizing joint PMFs
into parity check factors. This factorization allows a joint PMF to
be marginalized by the soft decoder of the dual Hamming code,
provided that a Galois field exists whose order equals the alphabet
size of the PMF.

This work may be continued by extending the idea to alphabet sizes
for which a Galois field does not exist. Another interesting topic
might be employing the fast Fourier transform algorithm to obtain the
projections.